Chapter 1: Linear Mixed Models

Include here an introduction to linear mixed models (https://stats.idre.ucla.edu/other/mult-pkg/introduction-to-linear-mixed-models/).

Wiki (https://en.wikipedia.org/wiki/Mixed_model).

Background

Some useful background for this topic: doi: 10.2134/agronj2019.02.0135.

Example: soybean row spacing meta-analysis

This first example is a meta-analysis of soybean row spacing. The question is: is there a difference between growing soybean in rows 15 inches apart vs. 30 inches apart? The planting density is assumed to be the same, so only the configuration of the planting scheme was changed. The experimental design is that of paired strips, organized in different trials located on different farms. Each farm/field/trial has unique soils and management history and experiences its own weather conditions. They are all located in Iowa, so each season will exert similar effects (e.g. dry vs. wet). It is common to analyze such data as the response ratio, or the yield of the treatment over the yield of the control. Taking the log is common practice.

\[ RR = \frac{Yield_T}{Yield_C} \]

\[ lrr = \log(RR) \]
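As a quick numeric check of the transformation (the yields below are made-up numbers, not data from the study):

```r
## Hypothetical yields, just to illustrate the transformation
yield_t <- 62   # treatment yield (made-up number)
yield_c <- 60   # control yield (made-up number)
rr  <- yield_t / yield_c
lrr <- log(rr)
round(c(RR = rr, lrr = lrr), 4)
## For small effects, lrr is close to the proportional change:
round(exp(lrr) - 1, 4)
```

Exponentiating the log response ratio recovers the ratio scale, which is how results are reported later in the chapter.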

## Select only a subset of the columns
soy <- read.csv("./ch1/srs.csv")[, c(1, 4, 7, 9)]
soy$lrr <- log(soy$TRT1_Yld / soy$TRT2_Yld)
soy.o <- soy[order(soy$lrr), ]
soy.o$id <- 1:nrow(soy.o)

Visually, it is hard to make a definitive statement, but it would seem that the practice is beneficial. We could count how many observations were positive and see that almost 60% of the observations have a positive outcome. Is this enough to recommend the use of the practice? Answer: no; an economic analysis would be a better tool for making decisions.

round(dim(subset(soy.o, lrr > 0))[1]/nrow(soy.o),2)
## [1] 0.6

There is a clear break in the ordered values, which suggests that some of these trials are quite different from the rest. Which variable explains these differences? Almost all of the high values are from the year 2014. In addition, we should also investigate how the observations group with respect to studies.

Almost all of the observations with a large response are from a single study. It is not uncommon to throw such a study out and call it an ‘outlier’, but we do not have a reason for doing so at the moment. What if it truly represents a situation that can occur again? We should consider it as part of the decision-making process.

Confidence intervals for each individual study

cif <- function(x, alpha=0.05){
  n.x <- length(x)
  se.x <- sd(x)/sqrt(n.x)
  m.x <- mean(x)
  qv <- qt(1-alpha/2, n.x-1)
  lci <- m.x -  qv * se.x
  uci <- m.x + qv * se.x
  ans <- c(lci, m.x, uci)
  names(ans) <- c("lci", "m", "uci")
  return(ans)
}
soy.a <- aggregate(lrr ~ Trial_ID, data = soy.o, FUN = cif)
soy.a2 <- data.frame(trial = soy.a$Trial_ID, soy.a$lrr)
## aggregate() returns 'lrr' as a matrix column; wrapping it in
## data.frame() flattens it into ordinary columns (lci, m, uci)
soy.a2.o <- soy.a2[order(soy.a2$m),]
soy.a2.o$id <- 1:nrow(soy.a2.o)

soy.tn <- aggregate(lrr ~ Trial_ID, data = soy.o, FUN = length)
names(soy.tn) <- c("trial","n")

soy.a3 <- merge(soy.a2.o, soy.tn)
  
ggplot(data = soy.a3, aes(x = m, y = id)) + 
  geom_point() + xlab("lrr") +
  geom_errorbarh(aes(xmin = lci, xmax = uci), height = 0.2) + 
  geom_vline(xintercept = 0) + 
  geom_text(aes(x = 0.265, y = id), label = as.character(soy.a3$n))

The variability in the confidence intervals is a reflection of both the underlying data and the sample size within each trial. Trials with a smaller sample size tend to have wider confidence intervals.
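The effect of sample size on interval width can be sketched directly: for a fixed standard deviation, the half-width of a 95% t-interval shrinks roughly like \(1/\sqrt{n}\), and the t quantile itself also shrinks as n grows. The sd of 0.06 below is just an illustrative value on the lrr scale:

```r
## Half-width of a 95% t-interval for a common sd, varying n
ci_halfwidth <- function(n, s = 0.06, alpha = 0.05) {
  qt(1 - alpha/2, n - 1) * s / sqrt(n)
}
## Half-widths for 2, 4, 8 and 16 reps: each doubling of n
## narrows the interval
round(sapply(c(2, 4, 8, 16), ci_halfwidth), 4)
```

The drop from n = 2 to n = 4 is especially dramatic because the t quantile with 1 degree of freedom is very large.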

## Number of trials with a positive response (ci does not include zero) 
## - proportion
sstgz <- subset(soy.a3, lci > 0)
cat("Trials sig. > 0:",round(nrow(sstgz)/nrow(soy.a3),2),"\n")
## Trials sig. > 0: 0.11
sstlz <- subset(soy.a3, uci < 0)
cat("Trials sig. < 0:", round(nrow(sstlz)/nrow(soy.a3),2),"\n")
## Trials sig. < 0: 0.06
cat("Trials incl. 0:",round(1 - (nrow(sstgz) + nrow(sstlz))/nrow(soy.a3),2),"\n")
## Trials incl. 0: 0.83

Now only 11% of the trials had a significantly positive response, about 6% had a significantly negative response, and for the rest, 83%, the confidence interval included zero. This is much less promising than the 60% we observed previously, but now we are counting the number of trials which had a positive response instead of the number of observations. It is somewhat ironic that the trial with the highest number of reps (the one with the highest precision) is the one that deviates the most from the response observed in the rest of the trials. This trial experienced hail, which is likely involved in the observed response. One possible explanation for the one trial with a significantly negative response is higher moisture in the canopy (in narrow rows), which leads to higher disease pressure, particularly white mold.

Linear Model

The disadvantage of the previous analysis is that we are not analyzing the data together. If we want to make a statement about the overall performance of the management practice, combining the trials is necessary. In the following steps we will gradually increase the complexity of the analysis. The first model (lm0) does not include trial; it analyzes the data ignoring its structure.

## Fitting a first model without trial, just the intercept
soy.lm0 <- lm(lrr ~ 1, data = soy.o)
## Parameter confidence interval
pp.lm0 <- predict(soy.lm0, 
                  newdata = data.frame(lrr = mean(soy.o$lrr)), 
                  interval = "confidence")
## Individual observation confidence interval
ppp.lm0 <- predict(soy.lm0, 
                   newdata = data.frame(lrr = mean(soy.o$lrr)), 
                   interval = "prediction")

A second step would be to include ‘trial’ in the model, but still in the context of a linear model. This model tests whether there is an effect of ‘trial’ (in this case it is significant); at least one trial has an effect which is different from the rest.

## Fitting a second model, now with trial as a fixed effect
## options(contrasts = rep(c("contr.treatment"), 2))
## Need to look into defining other contrasts
soy.lm1 <- lm(lrr ~ Trial_ID, data = soy.o)
pp.lm1 <- predict(soy.lm1, 
                  newdata = data.frame(Trial_ID = unique(soy.o$Trial_ID)),
                  interval = "confidence")
pp.lm1d <- data.frame(trial = unique(soy.o$Trial_ID), pp.lm1)
pp.lm1d2 <- merge(soy.a3, pp.lm1d)
## Prediction interval for the trial which is closest to the overall mean
pmt <- which.min(abs(pp.lm1d$fit - coef(soy.lm0)))
## The previous is potentially dangerous, because the actual mean might
## be far away from any trial in particular
ppp.lm1 <- predict(soy.lm1, 
                   newdata = data.frame(Trial_ID = unique(soy.o$Trial_ID)[pmt]),
                   interval = "prediction")
ppp.lm1 <- as.data.frame(ppp.lm1)

The main difference now that we analyze the data together is that the confidence intervals for individual trials are narrower. The reason is that we assume homogeneous variance and use the data from all the trials to estimate a single variance. For most of the trials the precision of the confidence interval has improved (it is narrower). Since the sample size for each trial is different, the intervals are different. Had the sample sizes for all trials been the same, the width of the intervals would have been exactly the same for all trials. The prediction interval for a single trial (close to the average) is shown, and it is slightly narrower than for the intercept of the lm0 (no-trial) model.
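The pooling argument can be illustrated with simulated data: each trial's own standard error is noisy when it rests on only a few reps, while the pooled residual sd gives every trial a more stable standard error (and, with equal n, the same one). All numbers below are made up:

```r
## Pooled vs. per-trial standard errors (simulated data)
set.seed(101)
k <- 10; n <- 4
trial <- rep(1:k, each = n)
y <- rnorm(k * n, mean = 0.02, sd = 0.06)
## Per-trial: each trial uses only its own 4 observations
se_own <- tapply(y, trial, function(x) sd(x) / sqrt(length(x)))
## Pooled: one residual sd estimated from all trials together
s_pooled <- sqrt(sum(tapply(y, trial, function(x) sum((x - mean(x))^2))) /
                 (k * (n - 1)))
se_pooled <- s_pooled / sqrt(n)
## The per-trial SEs vary widely; the pooled SE is a single value
range(se_own)
se_pooled
```

With equal reps per trial, the pooled variance is the average of the within-trial variances, so the pooled SE always lies between the smallest and largest per-trial SE.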

Linear Mixed Models

We have more or less exhausted what we can do with linear models for this example (not really). Now we move on to mixed models and consider ‘trials’ as ‘random’. This means that we are less focused on what happened in each individual trial and instead think of the trials as being ‘sampled’ from a larger population. When we consider a factor as fixed we assume that we can replicate its levels, but here each trial is an occurrence which we cannot easily replicate.

soy.lme <- lmer(lrr ~ 1 + (1|Trial_ID), data = soy.o)
round(exp(fixef(soy.lme)),3)
## (Intercept) 
##       1.014
round(exp(confint(soy.lme, quiet = T))[3,],3)
##  2.5 % 97.5 % 
##  0.983  1.046
##exp(confint(soy.lme))
##exp(confint(soy.lme, method = "boot"))
##For these type of models there are different methods for 
##computing confidence intervals ("profile","Wald","boot").

## Confidence intervals and predictions
ci.lme <- confint(soy.lme, quiet = T)[3,]
pp.lme <- c(fixef(soy.lme),ci.lme[1],ci.lme[2])

These results point to a 1.4% increase in yield when using the narrower row spacing. The confidence interval provides additional support that this is a small increase. The point estimates for the standard deviations are \(\hat{\sigma}_B =\) 0.0602765 (between trials) and \(\hat{\sigma} =\) 0.058963 (residual). The intra-class correlation is \(0.06028^2/(0.06028^2 + 0.05896^2) = 0.51\).
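The intra-class correlation can be checked directly from the reported standard deviations:

```r
## ICC from the reported variance components
sigma.B <- 0.0602765  # between-trial sd
sigma.W <- 0.058963   # residual sd
icc <- sigma.B^2 / (sigma.B^2 + sigma.W^2)
round(icc, 2)
## [1] 0.51
```

About half of the total variability in lrr is attributable to differences between trials, which is why modeling the trial structure matters here.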

The confidence interval for the intercept of the model is only slightly narrower than the one computed with the linear model (lm0), but its location is much closer to zero to the point that it overlaps the zero line. The effect is much smaller than what we had anticipated before. This analysis has the characteristic that it gives less weight to extreme observations.

Following Higgins et al. (2009), we can calculate the prediction interval:

\[ \hat{\mu} \pm t^{\alpha}_{k-2} \sqrt{\hat{\tau}^2 + SE(\hat{\mu})^2} \]

where \(\hat{\mu}\) is the estimated mean, \(\hat{\tau}\) the between-studies standard deviation, and \(SE(\hat{\mu})\) the standard error of the intercept.

n.k <- length(soy.a3$trial)                 # number of trials (k)
t.val <- qt(1 - 0.05/2, n.k - 2)
tau.sqrd <- VarCorr(soy.lme)$Trial_ID[[1]]  # between-trial variance
se.mu <- summary(soy.lme)$coefficients[1, "Std. Error"]
se.muh <- sqrt(tau.sqrd + se.mu^2)
pdi <- t.val * se.muh
mu.lci <- fixef(soy.lme) - pdi
mu.uci <- fixef(soy.lme) + pdi
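The same calculation can be wrapped in a small helper. The numbers passed below are illustrative only, not the fitted values from the model above:

```r
## Higgins-style prediction interval as a reusable helper;
## mu, tau, and se.mu here are made-up inputs, not model output
higgins_pi <- function(mu, tau, se.mu, k, alpha = 0.05) {
  qv <- qt(1 - alpha/2, k - 2)
  hw <- qv * sqrt(tau^2 + se.mu^2)
  c(lwr = mu - hw, est = mu, upr = mu + hw)
}
round(higgins_pi(mu = 0.014, tau = 0.06, se.mu = 0.016, k = 20), 3)
```

Because \(\hat{\tau}\) enters the half-width, the prediction interval is always at least as wide as the confidence interval for the mean.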

Bayesian model using MCMCglmm

In this first example we use the package ‘MCMCglmm’. We do not provide priors; they are created internally and are uninformative. In the following chapter there is an example of Bayesian meta-analysis using JAGS.

soy.mc <- MCMCglmm(lrr ~ 1, random = ~ Trial_ID, pr = TRUE, 
                  data = soy.o, verbose = FALSE, DIC = TRUE)
soy.mc.ci <- as.vector(summary(soy.mc)$solutions)[1:3]
## soy.mc.ci <- predict(soy.mc, interval = "confidence")[1,]
## Those two lines are equivalent

## With non informative priors the results are almost identical 
## to the lme solution
round(exp(rbind(pp.lme, soy.mc.ci)),3)
##           (Intercept) 2.5 % 97.5 %
## pp.lme          1.014 0.983  1.046
## soy.mc.ci       1.015 0.982  1.050
## Calculating a predictive interval using MCMCglmm
## the predict function marginalizes over random effects
## Each instance is generated through simulate
## At this point it is not possible to fix the seed
##soy.mc.pi0 <- predict(soy.mc, interval = "confidence")
##soy.mc.pi2 <- predict(soy.mc, marginal = NULL, 
##                                  interval = "confidence")
##matplot(1:nrow(soy.o),soy.mc.pi0, type = 'l')
##matplot(1:nrow(soy.o),soy.mc.pi2, type = 'l')
ndat <- data.frame(lrr = mean(soy.o$lrr), 
                   Trial_ID = unique(soy.o$Trial_ID))
soy.mc.pi <- colMeans(predict(soy.mc, newdata = ndat,
                              marginal = NULL,
                              interval = "prediction"))

soy.mc.pi <- data.frame(fit = soy.mc.pi[1],
                        lwr = soy.mc.pi[2],
                        upr = soy.mc.pi[3])
round(exp(soy.mc.pi),3)

Bayesian model using JAGS

soyd <- list(lrr = soy.o$lrr, trial = soy.o$Trial_ID,
           t.n = length(unique(soy.o$Trial_ID)), 
           n = nrow(soy.o))
init <- list(mu = 0, tau = 0.01, tau.trial = 0.01)
modelstring="
  model {
    # Single intercept model likelihood
    for (i in 1:n) {
      lrr[i]~dnorm(mu + b[trial[i]],tau)
    }

    # trial effect 
    for(j in 1:t.n){
      b[j] ~ dnorm(0, tau.trial)
    }
    
    # priors
    mu ~ dnorm(0,0.00001) # intercept prior
    tau ~ dgamma(0.0001,0.0001) ## tau is the residual precision
    sigma <- 1.0/sqrt(tau)
    tau.trial ~ dgamma(0.0001, 0.0001)
    sigma.trial <- 1.0/sqrt(tau.trial)
    # generate predictions 
    beff ~ dnorm(0, tau.trial)
    pred <- mu + beff
  }
"
mdl=jags.model(textConnection(modelstring), data=soyd, inits=init, 
               n.chains = 2)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 119
##    Unobserved stochastic nodes: 22
##    Total graph size: 289
## 
## Initializing model
## update() itself performs the burn-in (n.burnin is not an argument
## of update() for JAGS models)
update(mdl, n.iter = 5000)
output=coda.samples(model=mdl, 
                    variable.names=c("mu","pred","sigma","sigma.trial"), 
                    n.iter=10000, thin=10)
print(summary(output))
## 
## Iterations = 5010:15000
## Thinning interval = 10 
## Number of chains = 2 
## Sample size per chain = 1000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##                Mean       SD  Naive SE Time-series SE
## mu          0.01435 0.016757 3.747e-04      0.0005431
## pred        0.01586 0.064676 1.446e-03      0.0014763
## sigma       0.05942 0.004221 9.439e-05      0.0000970
## sigma.trial 0.06319 0.012506 2.796e-04      0.0003265
## 
## 2. Quantiles for each variable:
## 
##                 2.5%      25%     50%     75%   97.5%
## mu          -0.01935  0.00358 0.01434 0.02505 0.04748
## pred        -0.11540 -0.02615 0.01747 0.05636 0.14687
## sigma        0.05196  0.05644 0.05901 0.06204 0.06866
## sigma.trial  0.04395  0.05441 0.06158 0.06984 0.09340
plot(output, trace = FALSE)

plot(output, density = FALSE)

mcmc.chains <- as.data.frame(unclass(output[[1]]))

## Process data for graph
## Mean and intervals for mu
soy.mcj.m <- quantile(mcmc.chains$mu, c(0.025,0.5,0.975))
soy.mcj.p <- quantile(mcmc.chains$pred, c(0.025,0.5,0.975))

ggplot(data = mcmc.chains) + 
  geom_density(aes(x = mu, y = ..density..), color = "blue")+
  geom_density(aes(x = pred, y = ..density..), color = "orange") + 
  geom_text(aes(x = 0.15, y = 20), label = "mean distribution", 
            color = "blue") + 
  geom_text(aes(x = 0.15, y = 15), label = "observation distribution",
            color = "orange") 

Comparing model variance components

mthd <- c("lm","lmer","mcmc.glmm","jags")
cpsm <- data.frame(method = mthd, sigma.trial = NA, sigma = NA)
## values for lm
cpsm[1,2:3] <- c(NA, sigma(soy.lm0))
## values for lmer
cpsm[2,2:3] <- c(attr(VarCorr(soy.lme)$Trial_ID,"stddev"), sigma(soy.lme))
## values for MCMCglmm
cpsm[3,2:3] <- sqrt(summary(soy.mc$VCV)[[2]][,3])
## values for rjags
cpsm[4,2:3] <- summary(output)[[2]][c(4,3),3]
kable(cpsm, caption = "Comparing between and within trial std. deviations")
Comparing between and within trial std. deviations

method       sigma.trial      sigma
lm                    NA  0.1082348
lmer           0.0602765  0.0589630
mcmc.glmm      0.0618134  0.0593398
jags           0.0615761  0.0590105

Testing our assumptions on a different dataset

It is important to test our assumptions on a different dataset, ideally one where the predictions from lmer and from the MCMC approaches differ in an important way.

One such example is the foliar fungicide ‘Stratego YLD’.

sfs <- read.csv("./ch2/StrategoYldOnSoybean.csv")
sfs$lrr <- log(with(sfs, TRT_Yld/CTR_Yld))
## The lowest value is very different from the rest
summary(sfs$lrr)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -0.250709 -0.009848  0.024904  0.028104  0.063291  0.254443
## Plot by averaging
sfs.a <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = mean)
sfs.a$max <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = max)$lrr
sfs.a$min <- aggregate(lrr ~ Trial_ID, data = sfs, FUN = min)$lrr

sfs.ao <- sfs.a[order(sfs.a$lrr),]
sfs.ao$trial <- 1:nrow(sfs.ao)

sfs.lm0 <- lm(lrr ~ 1, data = sfs)
sfs.lm <- lm(lrr ~ Trial_ID, data = sfs)
sfs.lmer <- lmer(lrr ~ 1 + (1 | Trial_ID), data = sfs)
## with this particular dataset the between trial variance is 0
set.seed(12345)
## Default priors
sfs.mcmc1 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
                      pr = TRUE, verbose = FALSE)
sfs.mcmc2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs,
                      pr = TRUE, verbose = FALSE)
sfs.mcmc3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      pr = TRUE, verbose = FALSE)
sfs.mcmc.v <- mcmc.list(sfs.mcmc1$VCV, sfs.mcmc2$VCV, sfs.mcmc3$VCV)
sfs.mcmc.m <- mcmc.list(sfs.mcmc1$Sol, sfs.mcmc2$Sol, sfs.mcmc3$Sol)
gelman.diag(sfs.mcmc.v)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID       1.36       3.24
## units          1.00       1.00
## 
## Multivariate psrf
## 
## 1.1
plot(sfs.mcmc.v, density = FALSE)

set.seed(123456789)
prior1 <- list(R = list(V = 1, nu = 0.001), G = list(G1=list(V = 1, nu = 0.001)))
## These are 'non-informative' priors
sfs.mcmc1.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc2.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc3.2 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc.v2 <- mcmc.list(sfs.mcmc1.2$VCV, sfs.mcmc2.2$VCV, sfs.mcmc3.2$VCV)
sfs.mcmc.m2 <- mcmc.list(sfs.mcmc1.2$Sol, sfs.mcmc2.2$Sol, sfs.mcmc3.2$Sol)
gelman.diag(sfs.mcmc.v2)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID          1       1.01
## units             1       1.00
## 
## Multivariate psrf
## 
## 1
plot(sfs.mcmc.v2, density = FALSE)
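For a single variance component, MCMCglmm’s (V, nu) settings specify an inverse-Wishart prior that reduces to an inverse-gamma with shape nu/2 and scale nu*V/2 (this parameterization is described in the MCMCglmm CourseNotes). A quick look at what prior1 implies:

```r
## prior1 above used V = 1, nu = 0.001 for each variance component.
## For a 1x1 variance this inverse-Wishart is an inverse-gamma with:
V  <- 1
nu <- 0.001
shape <- nu / 2      # inverse-gamma shape
scale <- nu * V / 2  # inverse-gamma scale
c(shape = shape, scale = scale)
## Both tiny, i.e. a very diffuse prior on the variance
```

This is why prior1 behaves as a ‘non-informative’ choice, while prior2 below, with V set near the scale of the data and a larger nu, carries real information.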

set.seed(123456)
prior2 <- list(R = list(V = 0.0013978, nu = 1.492583), 
             G = list(G1=list(V = 0.0013978*diag(1), nu = 0.492583)))
## These are 'informative' priors
sfs.mcmc1.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc2.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc3.3 <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc.v3 <- mcmc.list(sfs.mcmc1.3$VCV, sfs.mcmc2.3$VCV, sfs.mcmc3.3$VCV)
sfs.mcmc.m3 <- mcmc.list(sfs.mcmc1.3$Sol, sfs.mcmc2.3$Sol, sfs.mcmc3.3$Sol)
gelman.diag(sfs.mcmc.v3)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID          1       1.01
## units             1       1.01
## 
## Multivariate psrf
## 
## 1
plot(sfs.mcmc.v3, density = FALSE)

sfsd <- list(lrr = sfs$lrr, trial = sfs$Trial_ID,
           t.n = length(unique(sfs$Trial_ID)), 
           n = nrow(sfs))
init <- list(mu = 0, tau = 0.01, tau.trial = 0.01)
modelstring="
  model {
    # Single intercept model likelihood
    for (i in 1:n) {
      lrr[i]~dnorm(mu + b[trial[i]],tau)
    }

    # trial effect 
    for(j in 1:t.n){
      b[j] ~ dnorm(0, tau.trial)
    }
    
    # priors
    mu ~ dnorm(0,0.00001) # intercept prior
    tau ~ dgamma(0.0001,0.0001) ## tau is the residual precision
    sigma <- 1.0/sqrt(tau)
    tau.trial ~ dgamma(0.0001, 0.0001)
    sigma.trial <- 1.0/sqrt(tau.trial)
    # generate predictions 
    beff ~ dnorm(0, tau.trial)
    pred <- mu + beff
  }
"
mdl=jags.model(textConnection(modelstring), data=sfsd, inits=init, 
               n.chains = 2)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 200
##    Unobserved stochastic nodes: 41
##    Total graph size: 489
## 
## Initializing model
## update() itself performs the burn-in (n.burnin is not an argument
## of update() for JAGS models)
update(mdl, n.iter = 5000)
output=coda.samples(model=mdl, 
                    variable.names=c("mu","pred","sigma","sigma.trial"), 
                    n.iter=10000, thin=10)
print(summary(output))
## 
## Iterations = 5010:15000
## Thinning interval = 10 
## Number of chains = 2 
## Sample size per chain = 1000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##                Mean       SD  Naive SE Time-series SE
## mu          0.02799 0.004937 1.104e-04      1.068e-04
## pred        0.02789 0.015105 3.378e-04      3.378e-04
## sigma       0.06235 0.003421 7.649e-05      7.749e-05
## sigma.trial 0.01354 0.004994 1.117e-04      1.586e-04
## 
## 2. Quantiles for each variable:
## 
##                  2.5%      25%     50%     75%   97.5%
## mu           0.018764 0.024554 0.02782 0.03132 0.03776
## pred        -0.001158 0.019053 0.02765 0.03666 0.05902
## sigma        0.056100 0.060010 0.06219 0.06456 0.06958
## sigma.trial  0.006045 0.009846 0.01271 0.01655 0.02479
plot(output, trace = FALSE)

plot(output, density = FALSE)

cfs.lm <- c(coef(sfs.lm0), NA, sigma(sfs.lm))
cfs.lmer <- c(fixef(sfs.lmer), attr(VarCorr(sfs.lmer)$Trial_ID,"stddev"), sigma(sfs.lmer))
cfs.mcmc0 <- c(summary(sfs.mcmc.m)$statistics[1], 
               sqrt(summary(sfs.mcmc.v)$statistics[,1]))
cfs.mcmc2 <- c(summary(sfs.mcmc.m2)$statistics[1], 
               sqrt(summary(sfs.mcmc.v2)$statistics[,1]))
cfs.mcmc3 <- c(summary(sfs.mcmc.m3)$statistics[1], 
               sqrt(summary(sfs.mcmc.v3)$statistics[,1]))
cfs.rjags <- c(summary(output)$statistics[1], summary(output)$statistics[c(4,3),1])
resm <- matrix(rbind(cfs.lm,cfs.lmer, cfs.mcmc0, 
                     cfs.mcmc2, cfs.mcmc3, cfs.rjags), nrow = 6, ncol = 3,
               dimnames = list(c("lm","lmer","mcmc.nopriors","mcmc.s.priors",
                                 "mcmc.p.priors","rjags"),
                               c("mu","sigma.trial","sigma")))
kable(round(resm,5), caption = "SFS original data")
SFS original data

method              mu  sigma.trial    sigma
lm             0.02810           NA  0.06241
lmer           0.02810      0.00000  0.06297
mcmc.nopriors  0.02813      0.00259  0.06329
mcmc.s.priors  0.02771      0.01910  0.06205
mcmc.p.priors  0.02780      0.01725  0.06192
rjags          0.02799      0.01354  0.06235

Comparing confidence and prediction intervals

The difference in the prediction intervals in the following table is problematic because JAGS does not agree at all with MCMCglmm. The discrepancy comes from what each one predicts: in the JAGS model ‘pred’ is mu + beff, the mean of a new trial with no residual error, while the MCMCglmm prediction interval also includes the residual (‘units’) variance, so it is much wider.
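One way to reconcile the two is to add residual noise to each posterior draw of the new-trial mean (mu + beff), which is what an observation-level prediction requires. A base-R sketch, using simulated draws as stand-ins for the actual chains (the means and sds below are only illustrative):

```r
## Simulated stand-ins for posterior draws of mu, the trial effect, and sigma
set.seed(42)
n.draws <- 4000
mu    <- rnorm(n.draws, 0.028, 0.005)
beff  <- rnorm(n.draws, 0, 0.014)
sigma <- abs(rnorm(n.draws, 0.062, 0.003))
pred.trial <- mu + beff                          # new-trial mean (as in the JAGS model)
pred.obs   <- rnorm(n.draws, pred.trial, sigma)  # adds residual noise per draw
round(rbind(trial = quantile(pred.trial, c(0.025, 0.975)),
            obs   = quantile(pred.obs,   c(0.025, 0.975))), 3)
```

The observation-level interval is much wider than the new-trial-mean interval, which is the pattern seen in the table.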

Comparing confidence and prediction intervals

mthd   int       mu      lwr     upr
lm     conf  0.0281   0.0193  0.0369
lmer   conf  0.0281   0.0194  0.0368
mcmc0  conf  0.0281   0.0193  0.0370
mcmc1  conf  0.0279   0.0181  0.0396
mcmc2  conf  0.0278   0.0180  0.0389
jags   conf  0.0278   0.0188  0.0378
lm     pred  0.0281  -0.0964  0.1526
lmer   pred  0.0281   0.0194  0.0368
mcmc0  pred  0.0281  -0.0941  0.1515
mcmc1  pred  0.0281  -0.0971  0.1524
mcmc2  pred  0.0280  -0.0940  0.1510
jags   pred  0.0277  -0.0012  0.0590

Same analysis as before but without the outlier

## One observation stands out: its control yield was extremely high
## relative to the treatment
summary(sfs[,c("TRT_Yld","CTR_Yld")])
##     TRT_Yld         CTR_Yld     
##  Min.   :34.00   Min.   :33.10  
##  1st Qu.:52.02   1st Qu.:49.20  
##  Median :59.45   Median :58.20  
##  Mean   :58.63   Mean   :57.15  
##  3rd Qu.:65.70   3rd Qu.:64.92  
##  Max.   :77.70   Max.   :77.80
print(sfs[which.min(sfs$lrr),c("Trial_ID","TRT_Yld","CTR_Yld","lrr")])
##        Trial_ID TRT_Yld CTR_Yld        lrr
## 54 ST2014IA280A    55.1    70.8 -0.2507093
## The next line gets rid of the outlier
sfs.woo <- subset(sfs, lrr > -0.2)

## Plot by averaging
sfs.a.woo <- aggregate(lrr ~ Trial_ID, data = sfs.woo, FUN = mean)
sfs.a.woo$max <- aggregate(lrr ~ Trial_ID, data = sfs.woo, FUN = max)$lrr
sfs.a.woo$min <- aggregate(lrr ~ Trial_ID, data = sfs.woo, FUN = min)$lrr

sfs.ao.woo <- sfs.a.woo[order(sfs.a.woo$lrr),]
sfs.ao.woo$trial <- 1:nrow(sfs.ao.woo)
ggplot(data = sfs.woo, aes(x = lrr, y = Trial_ID)) + geom_point() +
  ggtitle("The outlier is gone")

ggplot(data = sfs.ao.woo, aes(x = lrr, y = trial)) +
  geom_point() + xlab("lrr") +
  geom_errorbarh(aes(xmin = min, xmax = max)) + 
  geom_vline(xintercept = 0) + 
  ggtitle("Stratego YLD on Soybean")

sfs.lm0 <- lm(lrr ~ 1, data = sfs.woo)
sfs.lm <- lm(lrr ~ Trial_ID, data = sfs.woo)
sfs.lmer <- lmer(lrr ~ 1 + (1 | Trial_ID), data = sfs.woo)
set.seed(12345)
## Default priors
sfs.mcmc1.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                          pr = TRUE, verbose = FALSE)
sfs.mcmc2.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                          pr = TRUE, verbose = FALSE)
sfs.mcmc3.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                          pr = TRUE, verbose = FALSE)
sfs.mcmc.v.woo <- mcmc.list(sfs.mcmc1.woo$VCV, sfs.mcmc2.woo$VCV, sfs.mcmc3.woo$VCV)
sfs.mcmc.m.woo <- mcmc.list(sfs.mcmc1.woo$Sol, sfs.mcmc2.woo$Sol, sfs.mcmc3.woo$Sol)
gelman.diag(sfs.mcmc.v.woo)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID       1.34       3.06
## units          1.00       1.01
## 
## Multivariate psrf
## 
## 1.08
plot(sfs.mcmc.v.woo, density = FALSE)

set.seed(123456789)
prior1 <- list(R = list(V = 1, nu = 0.001), G = list(G1=list(V = 1, nu = 0.001)))
## These are 'non-informative' priors
sfs.mcmc1.2.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc2.2.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc3.2.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior1, pr = TRUE, verbose = FALSE)
sfs.mcmc.v2.woo <- mcmc.list(sfs.mcmc1.2.woo$VCV, sfs.mcmc2.2.woo$VCV, sfs.mcmc3.2.woo$VCV)
sfs.mcmc.m2.woo <- mcmc.list(sfs.mcmc1.2.woo$Sol, sfs.mcmc2.2.woo$Sol, sfs.mcmc3.2.woo$Sol)
gelman.diag(sfs.mcmc.v2.woo)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID          1       1.01
## units             1       1.00
## 
## Multivariate psrf
## 
## 1
plot(sfs.mcmc.v2.woo, density = FALSE)

set.seed(123456)
prior2 <- list(R = list(V = 0.0013978, nu = 1.492583), 
             G = list(G1=list(V = 0.0013978*diag(1), nu = 0.492583)))
## These are 'informative' priors
sfs.mcmc1.3.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc2.3.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc3.3.woo <- MCMCglmm(lrr ~ 1, random = ~Trial_ID, data = sfs.woo, 
                      prior = prior2, pr = TRUE, verbose = FALSE)
sfs.mcmc.v3.woo <- mcmc.list(sfs.mcmc1.3.woo$VCV, sfs.mcmc2.3.woo$VCV, sfs.mcmc3.3.woo$VCV)
sfs.mcmc.m3.woo <- mcmc.list(sfs.mcmc1.3.woo$Sol, sfs.mcmc2.3.woo$Sol, sfs.mcmc3.3.woo$Sol)
gelman.diag(sfs.mcmc.v3.woo)
## Potential scale reduction factors:
## 
##          Point est. Upper C.I.
## Trial_ID          1       1.01
## units             1       1.00
## 
## Multivariate psrf
## 
## 1
plot(sfs.mcmc.v3.woo, density = FALSE)

sfsd.woo <- list(lrr = sfs.woo$lrr, trial = sfs.woo$Trial_ID,
           t.n = length(unique(sfs.woo$Trial_ID)), 
           n = nrow(sfs.woo))
init <- list(mu = 0, tau = 0.01, tau.trial = 0.01)
modelstring="
  model {
    # Single intercept model likelihood
    for (i in 1:n) {
      lrr[i]~dnorm(mu + b[trial[i]],tau)
    }

    # trial effect 
    for(j in 1:t.n){
      b[j] ~ dnorm(0, tau.trial)
    }
    
    # priors
    mu ~ dnorm(0,0.00001) # intercept prior
    tau ~ dgamma(0.0001,0.0001) ## tau is the residual precision
    sigma <- 1.0/sqrt(tau)
    tau.trial ~ dgamma(0.0001, 0.0001)
    sigma.trial <- 1.0/sqrt(tau.trial)
    # generate predictions 
    beff ~ dnorm(0, tau.trial)
    pred <- mu + beff
  }
"
mdl=jags.model(textConnection(modelstring), data=sfsd.woo, inits=init, 
               n.chains = 2)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 199
##    Unobserved stochastic nodes: 41
##    Total graph size: 487
## 
## Initializing model
## update() itself performs the burn-in (n.burnin is not an argument
## of update() for JAGS models)
update(mdl, n.iter = 5000)
output=coda.samples(model=mdl, 
                    variable.names=c("mu","pred","sigma","sigma.trial"), 
                    n.iter=10000, thin=10)
print(summary(output))
## 
## Iterations = 5010:15000
## Thinning interval = 10 
## Number of chains = 2 
## Sample size per chain = 1000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##                Mean       SD  Naive SE Time-series SE
## mu          0.02964 0.004880 1.091e-04      1.121e-04
## pred        0.02983 0.014529 3.249e-04      3.249e-04
## sigma       0.05925 0.003185 7.123e-05      6.676e-05
## sigma.trial 0.01303 0.004601 1.029e-04      1.392e-04
## 
## 2. Quantiles for each variable:
## 
##                  2.5%      25%     50%     75%   97.5%
## mu          0.0204671 0.026217 0.02964 0.03289 0.03954
## pred        0.0008398 0.020687 0.02978 0.03878 0.05792
## sigma       0.0532318 0.057017 0.05909 0.06138 0.06573
## sigma.trial 0.0060456 0.009633 0.01229 0.01582 0.02340
plot(output, trace = FALSE)

plot(output, density = FALSE)

cfs.lm <- c(coef(sfs.lm0), NA, sigma(sfs.lm))
cfs.lmer <- c(fixef(sfs.lmer), attr(VarCorr(sfs.lmer)$Trial_ID,"stddev"), sigma(sfs.lmer))
cfs.mcmc0.woo <- c(summary(sfs.mcmc.m.woo)$statistics[1], 
                   sqrt(summary(sfs.mcmc.v.woo)$statistics[,1]))
cfs.mcmc2.woo <- c(summary(sfs.mcmc.m2.woo)$statistics[1], 
                   sqrt(summary(sfs.mcmc.v2.woo)$statistics[,1]))
cfs.mcmc3.woo <- c(summary(sfs.mcmc.m3.woo)$statistics[1], 
                   sqrt(summary(sfs.mcmc.v3.woo)$statistics[,1]))
cfs.rjags <- c(summary(output)$statistics[1], summary(output)$statistics[c(4,3),1])
resm <- matrix(rbind(cfs.lm,cfs.lmer, cfs.mcmc0.woo, cfs.mcmc2.woo, 
                     cfs.mcmc3.woo, cfs.rjags), nrow = 6, ncol = 3,
               dimnames = list(c("lm","lmer","mcmc.nopriors","mcmc.s.priors",
                                 "mcmc.p.priors","rjags"),
                               c("mu","sigma.trial","sigma")))
kable(round(resm,5), caption = "SFS without outlier")
SFS without outlier

method              mu  sigma.trial    sigma
lm             0.02951           NA  0.05964
lmer           0.02951      0.00363  0.05981
mcmc.nopriors  0.02947      0.00309  0.06004
mcmc.s.priors  0.02961      0.01860  0.05886
mcmc.p.priors  0.02950      0.01659  0.05895
rjags          0.02964      0.01303  0.05925

Comparing confidence and prediction intervals (without outlier)

Comparing confidence and prediction intervals

mthd   int       mu      lwr     upr
lm     conf  0.0295   0.0211  0.0379
lmer   conf  0.0295   0.0212  0.0378
mcmc0  conf  0.0295   0.0218  0.0385
mcmc1  conf  0.0295   0.0196  0.0395
mcmc2  conf  0.0297   0.0195  0.0397
jags   conf  0.0296   0.0205  0.0395
lm     pred  0.0295  -0.0890  0.1480
lmer   pred  0.0295   0.0181  0.0409
mcmc0  pred  0.0292  -0.0880  0.1470
mcmc1  pred  0.0295  -0.0896  0.1451
mcmc2  pred  0.0301  -0.0867  0.1472
jags   pred  0.0298   0.0008  0.0579

Tutorial about the use of the metafor package

Here I could include examples using the metafor package, as a way to learn how to use it and to see how it differs from the methods used above.

soy.m1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = mean)[,2]
soy.m2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = mean)[,2]
soy.sd1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = sd)[,2]
soy.sd2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = sd)[,2]
soy.n1i <- aggregate(TRT1_Yld ~ Trial_ID, data = soy.o, FUN = length)[,2]
soy.n2i <- aggregate(TRT2_Yld ~ Trial_ID, data = soy.o, FUN = length)[,2]

rr.escl <- escalc("ROM", m1i = soy.m1i, m2i = soy.m2i, 
                         sd1i = soy.sd1i, sd2i = soy.sd2i,
                         n1i = soy.n1i, n2i = soy.n2i)
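For reference, the ‘ROM’ (ratio of means) measure computed by escalc is the log ratio of means with a delta-method sampling variance. Here it is computed by hand for one hypothetical trial (all summary statistics made up):

```r
## Hand computation of the 'ROM' effect size for one hypothetical trial;
## this mirrors what escalc() does: yi = log ratio of means,
## vi = delta-method sampling variance
m1 <- 62; sd1 <- 4; n1 <- 6   # treatment summary (made-up numbers)
m2 <- 60; sd2 <- 5; n2 <- 6   # control summary (made-up numbers)
yi <- log(m1 / m2)
vi <- sd1^2 / (n1 * m1^2) + sd2^2 / (n2 * m2^2)
round(c(yi = yi, vi = vi), 5)
```

Note that vi shrinks with the number of reps in each arm, which is how precise trials earn more weight in the meta-analysis.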
soy.rma <- rma(yi, vi, data = rr.escl, test = "knha")
forest(soy.rma)

funnel(soy.rma)
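The random-effects summary that rma() produces can also be sketched by hand with the classic DerSimonian-Laird estimator (rma() defaults to REML, so the numbers would differ slightly); the effect sizes and variances below are made up:

```r
## A DerSimonian-Laird random-effects summary computed by hand
## (yi and vi are made-up effect sizes and sampling variances)
yi <- c(0.05, 0.01, -0.02, 0.08, 0.03)
vi <- c(0.002, 0.003, 0.002, 0.004, 0.003)
w  <- 1 / vi                       # fixed-effect (inverse-variance) weights
yb <- sum(w * yi) / sum(w)         # fixed-effect pooled estimate
Q  <- sum(w * (yi - yb)^2)         # heterogeneity statistic
k  <- length(yi)
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))
w.re  <- 1 / (vi + tau2)           # random-effects weights
mu.re <- sum(w.re * yi) / sum(w.re)
c(Q = Q, tau2 = tau2, mu = mu.re)
```

Adding tau2 to every vi flattens the weights, so extreme but precise studies pull the random-effects mean less than they pull the fixed-effect mean, which matches the behavior of the mixed models earlier in the chapter.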

## Caterpillar plot
## Adapted from
## http://www.metafor-project.org/doku.php/plots:caterpillar_plot
### create plot
forest(rr.escl$yi, rr.escl$vi,
       xlim=c(-0.2,0.3),        ### adjust horizontal plot region limits
       subset=order(rr.escl$yi),        ### order by size of yi
       slab=NA, annotate=FALSE, ### remove study labels and annotations
       efac=0,                  ### remove vertical bars at end of CIs
       pch=19,                  ### changing point symbol to filled circle
       col="gray40",            ### change color of points/CIs
       psize=1,                 ### increase point size
       cex.lab=1, cex.axis=1,   ### increase size of x-axis title/labels
       lty=c("solid","blank"))  ### remove horizontal line at top of plot
 
### draw points one more time to make them easier to see
points(sort(rr.escl$yi), nrow(rr.escl):1, pch=19, cex=0.5)
 
### add summary polygon at bottom and text
addpoly(soy.rma, mlab="", annotate=FALSE, cex=1, row = 1, col = "red")
text(-0.1, 1, "RE Model", pos=4, offset=0, cex=1, col = "red")